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The effect of heterogeneous sequence composition on the denaturation of double stranded DNA 
is investigated. The resulting pair-binding energy variation is found to have a negligible effect on the 
critical properties of the smooth second order melting transition in the simplest (Peyrard-Bishop) 
model. However, sequence heterogeneity is dramatically amplified upon adopting a more realistic 
treatment of the backbone stiffness. The model yields features of "multi-step melting" similar to 
those observed in experiments. 

PACS numbers: 87.15.Da, 68.35.Rh, 82.65.Y, 68.45.Gd 



The denaturation or melting of double-stranded DNA 
molecules upon changes in ambient temperatures or sol- 
vent conditions is a subject which has had a long his- 
tory ||,|. In early theoretical studies DNA melt- 
ing was described by the nearest-neighbor Id Ising model 
which yielded only a sharp crossover but not a thermo- 
dynamic phase transition Q . A smooth second order 
transition was later demonstrated in a modified version 
of the Id Ising model Q , after including an effective long- 
range interaction arising from the configurational entropy 
gain of the denatured segments. Experimentally, a pu- 
rified DNA sample containing an unique sequence and 
length is found to exhibit distinct multi-step melting, 
where the "melting curves" (to be specified below) ex- 
hibit sharp features consisted of plateaus of variable sizes 
separated by steps [§,|J . These fine features have been 
attributed to the melting of individual "domains" associ- 
ated with variations in the composition of the nucleotide 
sequences, since the binding energies of the two kinds of 
base pairs, adenine-thymine (AT) and guanine-cytosine 
(GC), are significantly different 0. The effect of binding 
energy variation has been studied previously using the 
nearest- neighbor Id Ising model In this paper, we 
investigate this effect in detail by incorporating the im- 
portant configurational entropy of the denatured strands 
in a systematic way. We find that sequence heterogene- 
ity itself is not sufficient to produce multi-step melting, 
with the phase transition remaining to be of the second- 
order. Nevertheless, heterogeneity can be dramatically 
amplified by small changes in the detailed form of the 
configurational entropy, leading to the apparent multi- 
step melting behavior for finite length sequences. 

A simple way of incorporating the entropy of the single- 
stranded segments is to model the two single strands i4 
and r„ by random walks || , with the index n denoting 
the n th base pair. The binding of complementary base 
pairs is described by a potential function V^rl 1 ^ — r^ 2 - 1 ), 
where V n has a hard core (reflecting the repulsion of the 
phosphate backbone) and an attractive (short-ranged) 
tail mimicking the hydrogen bond between each base 
pair. Further taking into account the directional speci- 



ficity of the hydrogen bonds, one obtains the following 
Hamiltonian for a double strand of N base pairs, 

1 N (K 1 

P H = t ^ 1 T {Vn+1 ~ Vn)2 + Vn{Vn) ■ (1) 

In Eq. (|l|), y n is the component of the relative displace- 
ment field r« — rl 2 " 1 along the direction of hydrogen bond 
and is the important degree of freedom we will focus on. 
The quadratic coupling describes the stiffness of the back- 
bone. 

The model (Q) without the ?i-dependence in the 
interaction V is known as the Peyrard-Bishop (PB) 
model |l(J. Although PB specifically used the Morse 
potential for V, the qualitative behavior of the system is 
well known for a large class of short-ranged potentials, 
via a mapping to a fictitious quantum mechanics prob- 
lem Oj. Let the depth of the (asymmetric) "potential 
well" V(y) be Uq and the range of attraction be a, then a 
continuous phase transition ]l^ ] occurs at a critical tem- 
perature T m , given by the condition T? n /(2Ka 2 ) ~ U . 
This phase transition is characterized by a discontinu- 
ity in the specific heat C and an algebraic divergence in 
the average separation distance t between the base pairs, 
I ~ (T m — T) _I/ with v = 1. The inverse of the pair dis- 
tance £ _1 is the order parameter of this transition and 
can be directly related (see below) to the fraction of un- 
broken base pairs, a key experimental observable . 

Next, we describe the effect of a variable interaction 
V n (y) which is fixed by the DNA sequence. For simplic- 
ity, we restrict y to the positive real axis, and model the 
interaction as a product of a short ranged function S a (y) 
[with S a (0 < y < a) « 1 and S a (y > a) — > 0] and a 
variable strength V n [Q. We assume the sequence to be 
random and short-range correlated, with an average pair- 
ing energy V n = —Uq, and the fluctuation SV n = V n — V 
described by its variance SV m SV n — A<5 m ,„. (Here, the 
overbar denotes average over the ensemble of random se- 
quences.) Some important conceptual issues to under- 
stand are whether the melting transition survives in the 
presence of the quenched-in variable interaction; and if 
so, what is the nature of the transition. 
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These issues have been studied in the past decade in 
the context of some closely related systems, e.g., one de- 
scribing the adsorption of a Gaussian random heteropoly- 
mer by a solid surface |l4|-|l8| (with y n being the distance 
of the n th monomer from the surface and V n (y) giving 
the interaction of that monomer with the surface), and 
another describing the wetting of a Id interface from a 
random substrate JTs| [2l]] . It is known that the melting 
transition still exists and the effect of quench-in random- 
ness is marginal in the renormalization group sense. This 
conclusion is straightforwardly reached, for instance, by 
considering the perturbative effect of a weak disorder 
on the melting temperature: Assuming that the effect 
of randomness is negligible in the small A limit, then 
fluctuation in base separation y n is correlated along the 
backbone up to the length £ ~ £ 2 . At a temperature T 
slightly below the melting temperature T rn of the pure 
system, £ ~ T^/(T m - T) 2 becomes very long. Variation 
in the interaction energy Uq averaged over the scale £ is 
SU ~ « \fE{T m - T)/T m , which leads to a shift 

in the melting temperature of the order 
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The effect of randomness is revealed by comparing ST m 
with ST = T m — T in the limit ST —> 0. Randomness is 
irrelevant if ST m <C ST, but is non-negligible if otherwise. 
The problem at hand is "marginal" since 5T m ~ ST. 

Much effort has been devoted to resolving whether the 
randomness is marginally relevant or marginally irrele- 
vant. Early studies JT^JT^] suggest that it is marginally 
irrelevant, such that critical properties of the melt- 
ing transition are the same as those of the pure case 
(up to logarithmic corrections). However, more recent 
renormalization-group studies 20 211] find the random- 



ness to be marginally relevant, indicating that the scaling 
properties should be different from the pure case beyond 
a crossover length £ x ~ exp[cU§/A] with c ~ 0(1), 
or equivalently if the reduced temperature is within the 

— 1/2 

Ginzburg temperature 5T X ~ £ x ' . The actual scaling 
behavior in the asymptotic "strong coupling" regime is 
however not accessible from these studies. 

The theoretical results have so far not been carefully 
tested numerically: In Ref. [[Hi, evidence in support of 
the irrelevancy of randomness was reported, while the 
contrary was claimed in Ref. ^0|. The numerics in the 
latter work was in fact rather qualitative; no attempt 
in characterizing the alleged strong coupling regime was 
made. The numerics in Ref. was flawed on the other 
hand by assuming a particular value for the melting tem- 
perature which was later shown to be incorrect |2(J . We 
have thus re-investigated the nature of the melting tran- 
sition numerically. 

As in Refs. Jl9| and ^0|, we use a transfer matrix cal- 
culation |24|] . We start with the transfer integral solution 
of the "wave function" (f> n (y) JlCfl , 



To speed up the numerics, we placed the recursion re- 
lation (||) on a lattice such that y takes on only non- 
negative integer values, from to L. We further re- 
stricted y n +i — Vn G {0,±1}- The potential function 
V n {y) has the simple form V n (y = 0) = — Uq ± y/~A with 
equal probability, and V n (y > 0) = 0. To character- 
ize the system, we compute (for each configuration of 
the random potential V n ) the "steady-state" probability 
distribution P n (y) of finding y n at y. This is obtained 
(up to normalization) as the product of the forward- and 
backward- propagated wave functions (see also Q ) . Suf- 
ficiently long segments (~ 10 4 ) close to the two ends of 
the sequence are truncated to ensure that the results are 
independent of the choice of boundary conditions. From 
the steady-state distribution, we compute the moments 
(Vri) l = ^2y=o y m Pn{y), and then average over different 
realization of the randomness (or average over n for very 
long sequences). For example, the average pair distance 
I is given by I = (y n ) L - 

We first identify the critical point by monitoring 
the L-dependence of the dimensionless variance, Sy — 

[{VnlL ~ (j/")!] 1 ^ 2 /^, w bich should be L-independent at 
the critical point. This is shown in the inset of Fig. 1(a) 
for systems with U = — 1, A = 1, K = 1 and N = 10 5 , 
averaged over 10 independent realization of random se- 
quences. The parameters are chosen such that £ x ~ 0(1) 
and the system is readily in the strong coupling regime. 
[Note also that the unit of n in our simplified discrete 
model is no longer one base pair. With the above param- 
eters, the length of our system corresponds to a sequence 
of several thousand base pairs of the PB model |1(J , with 
±20% pair-to-pair variation in the binding energy.] Us- 
ing the empirically obtained value of T m , we plot the pair 
distance I vs. the reduced temperature in Fig. 1(a). An 
exponent v — 1 is obtained, indicating the absence of 
anomalous scaling. To further test the relevancy of the 
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FIG. 1. (a) The average pair distance I for various trans- 
verse system sizes. The melting temperature T m w 3.07 is de- 
termined from the plot in the inset, (b) The "melting curves" , 
showing the fraction of unbroken pairs P(0). Scaling laws 
with v = 1 are shown by the dotted straight lines. 
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randomness, we directly measure the average fraction 
of unbroken pairs, -P n (0), which should scale as 1/1 ~ 
(T m — T) if the randomness is irrelevant. However, as 
pointed out in Ref. |p2| , relevancy of the randomness 
would imply additional singularity in the distribution 
P n (y —* 0), resulting in the anomalous scaling of P„(0). 
This quantity is plotted against the reduced tempera- 
ture in Fig. 1(b). Again, wc find no evidence of anoma- 
lous scaling. Since our calculations are performed in the 
strong-coupling regime, we conclude that cither the ran- 
domness is irrelevant or the asymptotic scaling in the 
strong-coupling regime is almost indistinguishable from 
the pure problem. Similar results have been obtained for 
a variety of different parameter choices. 

While the ultimate resolution to the issue of the rel- 
evancy of sequence heterogeneity may require yet larger 
systems with good statistics, it is clear from Fig. 1 that 
the melting transition encountered here is rather smooth. 
Indeed, the numerically obtained melting curve for a sin- 
gle sample is very smooth (see Fig. 3(b) below), without 
any noticeable fine structures. The smoothness of the 
transition is irrefutable even in the specific heat curves 
used to support the relevancy of randomness in Ref. p0[ . 
This makes the experimentally observed multi-step melt- 
ing behavior rather puzzling. To investigate the possible 
cause of multi-step melting, we recall a recent numeri- 
cal finding [^5| that fluctuation effect in DNA melting 
is much enhanced upon adopting a more realistic form 
of the backbone stiffness, to reflect the fact that the the 
DNA is significantly more rigid in the double-stranded 
conformation. An explicitly y -dependent stiffness 

-(3/n+B»+l)/2b (£\ 



K(y n ,y n 
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was used in Ref. |2j| to match the stiffness of the double 
strand K 2 (for y = 0) and the single strand K\ < K 2 
(for y — > oo). Numerical solution p5| of the homoge- 
neous version of ([}]) using the modified stiffness K(y' , y) 
yields what appears to be a first-order melting transition! 

Let us examine the effect of the variable stiffness in 
some detail. To facilitate the analysis, we modify the ex- 
ponential factor in (|4|) to e~ Vn l h . This does not cause any 
significant differences since y n » y n +i- With the modi- 
fied form of K(y), one can straightforwardly perform the 
transfer integral (^|) , which in the continuum limit yields 
a Schrodinger-like equation for <fi n (y), with an effective 
potential V n (y) = V n (y) + ~log[K(y)/Ki] + const, as 
well as an effective diffusion coefficient T/2K(y). Let us 
focus on the form of V: In addition to the original attrac- 
tive potential V n {y) of range a, there is now a repulsive 
term of the order Ui = (T/2) log{K 2 /K 1 ) > of range b. 
The latter plays the role of an (entropic) barrier and is 
the main effect introduced by the variable stiffness K(y). 
Qualitative features of the homogeneous system can be 
obtained by solving the Schrodinger equation using a con- 
stant stiffness K and a toy potential function V(y) on the 
half space y > 0, with V = — Uq + U\ for y < a, 
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FIG. 2. (a) Solution of the homogeneous PB model with 
a variable backbone stiffness. The order parameter l~ x still 
vanishes linearly with the reduced temperature, (b) The slope 
a has an exponential dependence on the ratio b/a. T m itself 
changed by ~ 50% over the range of b/a studied. 

V = +Ui for a < y < b, and V = for y > b. Wc 
find the order parameter l^ 1 of the homogeneous prob- 



lem still to have the form 



a ■ (T m - T)/T m near 



T m , although the amplitude a increases exponentially for 
increasing Ui or b/a. Our finding is verified numerically 
(Figs. 2) by exactly diagonalizing the transfer integral 
(||) for the homogeneous problem, using the Morse po- 
tential for V(y) and Eq. (@) for K(y,y') as in Ref. ||. 
Note that the width of the transition region scales as 1 /a. 
Thus, for the parameter value K 2 /K\ = 1.5 and b/a ~ 5 
used in Ref. Q (corresponding to a w 100 according 
to Fig. 2(b)), the transition region is extremely narrow, 
making it very much first-order-like in appearance. 

The heterogeneous system is studied next using the 
lattice transfer matrix algorithm, incorporating the toy 
potential V as described above, with Uq — > Uo±V~A. We 
used Ui = 0.2 and b/a = 3 such that the a-value of the 
homogeneous system is ~ 100 as in Ref. [f^sf . For such 
large values of en's, the melting curves for individual sam- 
ples display drastic multi-step behavior even for rather 
long sequences of N = 10 5 ; a typical example is shown in 
Fig. 3(a). The very smooth melting curve for the original 
heterogeneous model without barrier (a ~ 1) is shown in 
Fig. 3(b) for comparison. As in the pure PB model with 
entropic barrier, we expect that the heterogeneous model 
still undergoes a second order melting transition with 
self-averaging melting curve at sufficiently large scales. 
We can understand the sharp steps in Fig. 3(a) as result- 
ing from the first-order-like transition of various domains 
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FIG. 3. Melting curves for a single random sequence: (a) 
with and (b) without the entropic barrier (see text). 



3 



with different local transition temperatures T m , shifted 
by variations in the average local Uo along the sequence: 
Due to the small width of the transition region (~ 
a sequence length of 0(a 2 ) is necessary just to reduce 
the typical shift in T m down to the size of the transition 
region. Thus, the crossover length for the onset of self- 
averaging is expected to be a factor of 0(a 2 ) longer for 
the system with barrier. The exponential dependence 
of a on the barrier makes the multi-step feature easily 
observable for realistic sequence lengths. 

The domain structure are readily visualized by plotting 
the full probability distribution P n (y) at various temper- 
atures close to the melting point (Fig. 4). It is seen that 
isolated segments of the sequence unbind already at as 
much as 10% below the nominal transition temperature, 
indicating that the equilibrium configuration of the DNA 
consists of localized bubbles of denature regions. To test 
whether the experimentally observed multi-step features 
are indeed the result of an effective barrier induced by the 
variable stiffness, one needs to determine very accurately 
the form of K(y). 




FIG. 4. Density plot of the probability distribution P n {y) 
for a system with entropic barrier as in Fig. 3(a). Darker 
shades correspond to larger probability. 

To summarize, we have shown that variations in base- 
pair binding is by itself insufficient to generate the multi- 
step melting behavior for heterogeneous DNA strands. 
However, the inclusion of a variable backbone stiffness 
results in an entropic barrier which yields a sharp, first- 
order-like transition for the homogeneous system, and 
multi-step melting for the heterogeneous system. Asymp- 
totically, the transition is still expected to be second or- 
der; however, the crossover length is exponentially long. 
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